Syntactic parsing of chat language in contact center conversation corpus

نویسندگان

  • Alexis Nasr
  • Géraldine Damnati
  • Aleksandra Guerraz
  • Frédéric Béchet
چکیده

Chat language is often referred to as Computer-mediated communication (CMC). Most of the previous studies on chat language has been dedicated to collecting ”chat room” data as it is the kind of data which is the most accessible on the WEB. This kind of data falls under the informal register whereas we are interested in this paper in understanding the mechanisms of a more formal kind of CMC: dialog chat in contact centers. The particularities of this type of dialogs and the type of language used by customers and agents is the focus of this paper towards understanding this new kind of CMC data. The challenges for processing chat data comes from the fact that Natural Language Processing tools such as syntactic parsers and part of speech taggers are typically trained on mismatched conditions, we describe in this study the impact of such a mismatch for a syntactic parsing task.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Web Chat Conversations from Contact Centers: a Descriptive Study

In this article we propose a descriptive study of a chat conversations corpus from an assistance contact center. Conversations are described from several view points, including interaction analysis, language deviation analysis and typographic expressivity marks analysis. We provide in particular a detailed analysis of language deviations that are encountered in our corpus of 230 conversations, ...

متن کامل

An improved joint model: POS tagging and dependency parsing

Dependency parsing is a way of syntactic parsing and a natural language that automatically analyzes the dependency structure of sentences, and the input for each sentence creates a dependency graph. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers do the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...

متن کامل

برچسب‌زنی خودکار نقش‌های معنایی در جملات فارسی به کمک درخت‌های وابستگی

Automatic identification of words with semantic roles (such as Agent, Patient, Source, etc.) in sentences and attaching correct semantic roles to them, may lead to improvement in many natural language processing tasks including information extraction, question answering, text summarization and machine translation. Semantic role labeling systems usually take advantage of syntactic parsing and th...

متن کامل

Vocabulary Lists for EAP and Conversation Students

Despite the abundance of research investigating general and academic vocabularies and developing dozens of word lists, few studies have compared academic vocabulary with general service word lists such as conversation vocabulary. Many EAP researchers assume that university students need to know all the words in West’s (1953) General Service List (GSL) as a prerequisite to academic words (e.g., ...

متن کامل

Parsing Arabic Dialects

The Arabic language is a collection of spoken dialects with important phonological, morphological, lexical, and syntactic differences, along with a standard written language, Modern Standard Arabic (MSA). Since the spoken dialects are not officially written, it is very costly to obtain adequate corpora to use for training dialect NLP tools such as parsers. In this paper, we address the problem ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016